There is considerable evidence that inhaled toxicants such as cigarette smoke can cause both irreversible changes to 
the genetic material (DNA mutations) and putatively reversible changes to the epigenetic landscape (changes in the 
DNA methylation and chromatin modification state). The diseases that are believed to involve genetic and epigenetic 
perturbations include lung cancer, chronic obstructive pulmonary disease (COPD), and cardiovascular disease (CVD), 
all of which are strongly linked epidemiologically to cigarette smoking. In this review, we highlight the significance 
of genomics and epigenomics in these major smoking-related diseases. We also summarize the in vitro and in vivo 
findings on the specific perturbations that smoke and its constituent compounds can inflict upon the genome, 
particularly on the pulmonary system. Finally, we review state-of-the-art genomics and new techniques such as 
high-throughput sequencing and genome-wide chromatin assays, rapidly evolving techniques which have allowed 
epigenetic changes to be characterized at the genome level. These techniques have the potential to significantly 
improve our understanding of the specific mechanisms by which exposure to environmental chemicals causes 
disease. Such mechanistic knowledge provides a variety of opportunities for enhanced product safety assessment 
and the discovery of novel therapeutic interventions. 
Genomic and epigenomic perturbations 
produced by cigarette smoke 

Exposure to mutagens 

In humans, the process of detoxification of tobacco carcinogens 
has two phases. Phase I xenobiotic metabolizing enzymes 
transform oxidizable substrates into electrophiles, and 
phase II-related enzymes attack oxidized substrates via 
nucleophilic reactions, providing an efficient process 
for xenobiotic metabolism and excretion. Accordingly, 
the genes encoding the enzymes involved in xenobiotic 
metabolism are up-regulated in the bronchial epithelium 
of smokers (Spira et al., 2004; Beane et al., 2011) and in 
smoke-exposed rodent lung tissue (Gebel et al., 2004; 
Gebel et al., 2006). Some metabolites formed during the 
detoxification process can be highly reactive and may create 
covalent adducts at guanine and adenine bases. Adducts 
are normally eliminated through the DNA excision repair 
pathway, but if an adduct is still present during replication, 
the DNA polymerase may bypass such an altered base 
leading to a mutation (Pfeifer et al., 2002; Hang, 2010). 

The mutagenicity of tobacco smoke has been dem- 
onstrated in various assays including the mouse lym- 
phoma assay (MLA) (OECD, 1997). In a recent study, Guo 
et al. tested smoke condensate from 11 different cigarette 
brands and found that they all demonstrated dose-dependent 
mutagenic effects (Guo et al., 2011). Moreover, the 
tobacco-specific N-nitrosamine (TSNA) nicotine-derived 
4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) 
induces lung tumors in all tested species (mouse, rat, hamster, 
and mink) regardless of the route of administration. 

Some of these agents or their combinations may also 
induce genomic aberrations in smokers. Genetic lesions, 
such as loss-of-heterozygosity (LOH) and other chromosomal 
aberrations, are found in dysplastic, hyperplastic, 
and even histologically normal bronchial epithelium 
in clinically cancer-free smokers (Franklin et al., 1997; 
Mao et al., 1997; Wistuba et al., 1997; Nelson et al., 1998). 
There is a higher frequency of overall LOH in smoker 
lungs as compared to non-smokers, and among former 
smokers the LOH at 3p14 (containing the tumor suppressor 
gene FHIT) is less frequent than in active smokers. 

Epigenomic changes induced by tobacco carcinogens 

While most compounds that are present in cigarette 
smoke and in various other environmental contaminants 
are non-mutagenic, some apparently non-mutagenic 
compounds can interfere with gene expression by means 
of epigenetic mechanisms. There is ample evidence for 
smoking-induced epigenetic events in the clinical context. 
Hypermethylated promoters have been observed in 
clinically cancer-free and cancer-bearing smokers. These 
promoters can be detected in cell samples derived from 
sources as diverse as bronchoscopy, sputum, broncho- 
alveolar lavage fluid (BALF), alveolar macrophages, or 
lymphoblasts (Belinsky et al., 1998; Palmisano et al., 
2000; Lamy et al., 2002; Kim et al., 2004; Russo et al., 2005; 
Belinsky et al., 2005; Belinsky, 2005; Belinsky et al., 2006; 
Kerr et al., 2007; Baryshnikova et al., 2008; Monick et al., 
2012). Even though promoter hypermethylation can persist 
for many years after smoking cessation (Zöchbauer-Müller 
et al., 2003; Bhutani et al., 2008), current smokers 
generally have a higher mean methylation index than 
former smokers, supporting the view that hypermethylation, 
unlike somatic mutations, is reversible (Yanagawa 
et al., 2011). While the methylation of Retinoic Acid 
Receptor Beta (RARB), p16, fragile histidine triad (FHIT), 
and RAS association domain family 1A (RASSF1A) correlates 
weakly with smoking status, the degree of methylation 
increases with the intensity of smoking. 
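
The methylation index referred to above is commonly computed as the fraction of tested genes that are found methylated in a sample. The following minimal Python sketch assumes that definition; the function name `methylation_index` and the example calls for a single subject are hypothetical illustrations and are not data from the cited studies.

```python
def methylation_index(status):
    """Fraction of tested genes that are methylated in one sample.

    `status` maps a gene name to True (methylated) or False (unmethylated).
    Assumption: the index is methylated genes / genes tested.
    """
    if not status:
        raise ValueError("at least one gene must be tested")
    return sum(1 for methylated in status.values() if methylated) / len(status)


# Hypothetical calls for one subject (illustrative values only).
subject = {"RARB": True, "p16": False, "FHIT": True, "RASSF1A": False}
print(methylation_index(subject))  # 2 of 4 genes methylated -> 0.5
```

Averaging this value over the subjects in a group would give the mean methylation index compared between current and former smokers.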

The smoking effect has even been detected in cell-free 
DNA present in plasma. When methylation of at least one 
of KIF1A, DCC, RARB, or NISCH was considered, methylation 
frequency was smoking-dependent; while none 
of the light- or non-smoker controls showed plasma 
DNA methylation, the cumulative smoking dose (pack- 
years) correlated well with the methylation frequency in 
cancer-free heavy smokers (Ostrow et al., 2010). 
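
The cumulative smoking dose in pack-years is conventionally the number of packs smoked per day multiplied by the number of years smoked. A minimal Python sketch of that arithmetic follows; the function name `pack_years` and the default of 20 cigarettes per pack are assumptions made for illustration.

```python
def pack_years(cigarettes_per_day, years_smoked, cigarettes_per_pack=20):
    """Cumulative smoking dose: packs per day multiplied by years smoked.

    Assumption: one pack contains 20 cigarettes unless stated otherwise.
    """
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return (cigarettes_per_day / cigarettes_per_pack) * years_smoked


print(pack_years(20, 30))  # one pack a day for 30 years -> 30.0
print(pack_years(40, 30))  # two packs a day for 30 years -> 60.0
```

Under this definition, one pack a day for 30 years and two packs a day for 15 years give the same cumulative dose.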

This frequent hypermethylation in smoker tissues has 
been explained by the high levels of DNA methyltrans- 
ferase 1 (DNMT1) that correlated with smoking status in 
lung tumor samples (Lin et al., 2010). In vitro and murine 
in vivo experiments have further shown that NNK acts 
through Akt signaling and inhibits DNMT1 protein deg- 
radation. Subsequently, DNMT1 protein accumulation 
leads to increased tumor suppressor gene hypermethyl- 
ation (Damiani et al., 2008; Lin et al., 2010; Liu et al., 2010). 

In an attempt to develop biomarkers of exposure, 
recent work has identified regions of altered DNA methyl- 
ation in the lungs of SENCAR (SENsitive to CARcinogens) 
mice exposed to a single dose of 7,12-dimethylbenz[a] 
anthracene (DMBA) with or without cigarette smoke 
(Phillips and Goodman, 2009). The aberrant methylation 
was detected at very early time points, before any obvious 
lung histopathology. Based on their results, Phillips and 
Goodman (2009) suggested that regions of altered DNA 
methylation could serve as biomarkers of both exposure 
and effect. 
Numerous in vitro studies have attempted to reproduce 
smoke-induced epigenomic changes observed in the lung 
of a smoker (Mass and Wang, 1997; Liu et al., 2007; Liu 
et al., 2010). A recent study has shown that gene-specific 
promoter methylation is induced in immortalized lung 
epithelial cells after prolonged (several months) exposure 
to cigarette smoke condensate (Liu et al., 2010). Currently, 
it is not known whether this methylation is reversible or 
what the time frame for demethylation would be 
after smoke exposure ends. Reversible smoke 
effects on DNA methylation have been documented in 
cultured lung cancer cells. In A549 cells, the pro-metastatic 
oncogene synuclein-γ (SNCG) is silenced by methylation of its 
CpG island (a genomic region that contains a high frequency of 
CG dinucleotides). Cigarette smoke extract (CSE) 
induced SNCG demethylation, accompanied by gene 
overexpression, within just 3 days of treatment. The demethylation 
was associated with a twofold decrease of DNMT3B 
mRNA. Withdrawal of the treatment resulted in the recov- 
ery of DNMT3B expression and concomitant re-establish- 
ment of SNCG CpG methylation (Liu et al., 2007). 

Lung cancer 

The various forms of lung cancer display phenotypically 
diverse cell types. According to the clonal evolution 
model, cancer arises from a single cell, and the extent of 
cumulative genomic instability dictates tumor progression 
(Nowell, 1976). More recent evidence suggests that 
within a tumor, only a small population of cells has a self- 
renewing capacity making them able to drive malignant 
growth (Eramo et al., 2008). According to this cancer 
stem cell theory (reviewed in Rivera et al., 2011), either 
restricted progenitors or more differentiated lung cells 
could convert to cancer stem cells with self-renewing 
capacity. Such cell populations have already been identi- 
fied for both small cell (SCLC) and non-small cell lung 
cancer (NSCLC) (Eramo et al., 2008). 

Lung cancer development involves various genomic 
perturbations, such as point mutations, deletions, and 
gene amplifications. The short arm of chromosome 3 
contains several candidate tumor suppressor genes, 
and deletions within this region are detected in nearly 
100% of lung cancers (Zabarovsky et al., 2002). The 
appearance of activating mutations in dominant onco- 
genes, or inactivating mutations in recessive tumor 
suppressor genes, as well as chromosomal aberrations 
are generally regarded as late events in tumor develop- 
ment. However, in support of the field cancerization 
theory (Steiling et al., 2008; Gazdar and Minna, 2009), 
some mutations have been described in non-malignant 
lung tissue adjacent to resected lung tumors (Nelson 
et al., 1998; Zabarovsky et al., 2002). 

A recent study has confirmed that cigarette smoking 
is linked to specific copy number alterations in NSCLC. 
Analyses of early stage tumors showed that heavy smok- 
ers (>60 pack-years) have more copy number gains than 
light and non-smokers. These gains predominantly 
affected oncogenes and genes associated 
with tumor growth. Interestingly, copy number 
losses located in intergenic regions were found in light 
and non-smoker tumors, a phenomenon not seen in 
tumors from heavy smokers (Huang et al., 2011). 

Even though adenocarcinoma is becoming the domi- 
nant lung cancer type in smokers today, it presents a vari- 
able, distinct subtype that is also found in non-smokers 
(Subramanian and Govindan, 2010). However, lung 
tumors found in non-smokers are genetically different 
from those found in smokers (Sun et al., 2007). While 
mutations in the GTPase gene encoding KRAS and the 
tumor suppressor gene encoding p53 are more specific 
to smoking-related lung adenocarcinoma, the sequence 
encoding epidermal growth factor receptor (EGFR) kinase 
domain often becomes mutated very early in the course 
of smoking-independent lung adenocarcinoma develop- 
ment (Herbst et al., 2008; Lantuejoul et al., 2009; Dacic et 
al., 2010; Broet et al., 2011). KRAS and EGFR mutations 
have been found to be mutually exclusive, and there was 
an inverse correlation between smoking status (number 
of pack-years and duration of smoking) and the frequency 
of EGFR mutations (Yanagawa et al., 2011). It has been 
proposed, supported by the positive correlation between 
smoke-free years and the occurrence of EGFR mutations, 
that tobacco smoking suppresses EGFR mutation (Garinis 
et al., 2001; Dacic et al., 2010; Lee et al., 2010; Broet et al., 
2011; Yanagawa et al., 2011). An inverse correlation has 
been shown between EGFR mutation and the methylation of 
RASSF1A, FHIT, and runt-related transcription factor 3 
(RUNX3) (Yanagawa et al., 2011). 

The gene encoding tumor suppressor p53 is com- 
monly mutated in cancers, and contrary to the EGFR 
mutation, p53 mutations occur more frequently in smok- 
ers than in non-smokers (Ryberg et al., 1994; Garinis 
et al., 2001; Pfeifer et al., 2002). Anna et al. showed that 
among lung cancer patients, prolonged duration of 
smoking increased the frequency of p53 mutations. 
While the mutation frequency was 14.3% in smokers of 
less than 20 years, half of those who had smoked for more 
than 20 years carried the mutation (Garinis et al., 2001; 
Anna et al., 2009). The occurrence of p53 mutations is 
not random, and several 'hot-spots' were observed within 
the region that encodes the DNA-binding domain of 
p53 (Denissenko et al., 1996). In a subsequent study, 
it was demonstrated that the locations of the mutation 
'hot-spots' correlated with the methylation status of the 
proximal CpG dinucleotides (Denissenko et al., 1997). 

The Tumor Sequencing Project (TSP) studied 188 
lung adenocarcinomas and identified a group of somatic 
mutations that might affect the key pathways in adeno- 
carcinoma development. The mutated genes that were 
found included formerly identified tumor suppressors, 
such as p53, CDKN2A, and STK11 and oncogenes such as 
KRAS, EGFR, and NRAS. In addition, several further 
mutated genes were discovered, including putative tumor 
suppressor genes, such as ATM, NF1, RB1, and APC, as well 
as putative proto-oncogenes, such as ERBB4, KDR, FGFR4, 
and NTRK. Many of these genes were not only mutated, 
but the identified loci had also undergone copy number 
and/or gene expression changes. It was concluded that 
even though lung adenocarcinomas are highly heteroge- 
neous, the pathways that are affected are most likely the 
same among the majority of the different subtypes. The 
dominant pathways that were affected included those of 
MAPK, p53, Wnt, cell cycle and mTOR (Ding et al., 2008). 

In cancer biology, the most relevant, or at least the 
best understood, epigenetic mechanism is DNA 
hypermethylation of specific loci (CpG islands) in the 
promoter regions of known and presumptive tumor suppressor 
genes, which is a common hallmark of human 
tumors in general (Esteller, 2007), and of lung cancer in 
particular (Zöchbauer-Müller et al., 2002; Belinsky, 2005). 
Importantly, there is now strong evidence that aberrant 
promoter methylation and the consequent silencing of 
tumor suppressor genes is a critical step during tumor 
development, and is frequently detected in so-called 
precursor (benign) lesions (Belinsky et al., 1998). Thus, 
aberrant promoter methylation is regarded as an early 
event in the multistep process of carcinogenesis. 

Genes 'targeted' by promoter hypermethylation 
encompass a broad functional spectrum including some 
involved in cell cycle arrest such as p16 (major target) and 
p53, DNA repair such as MGMT (O6-methylguanine-DNA 
methyltransferase), or growth arrest such as RASSF1A. 
Importantly, although it requires at least 5 years of follow- 
up, there is a strong indication that clinically cancer-free 
chronic smokers with hypermethylated promoters in 
actual and presumptive tumor suppressor genes are 
at an elevated risk of developing lung cancer (Belinsky 
et al., 1998; Palmisano et al., 2000; Jarmalaite et al., 2003; 
Baryshnikova et al., 2008; Feng et al., 2008). The frequency 
of p16 hypermethylation, which is considered one of the 
prime aberrations in lung cancer development, increases 
with disease progression (Belinsky et al., 1998; Belinsky, 
2005). In fact, p16 is frequently fully inactivated in lung 
cancer via allele-loss and inactivation of the remaining 
allele by homozygous deletion, hypermethylation, or 
mutation (Cairns et al., 1995; Merlo et al., 1995; Rusin et al., 
1996). The consequence of p16 inactivation is increased 
cyclin D expression, which leads to persistent hyper-phosphorylation 
of retinoblastoma protein (Rb) with resultant 
evasion of cell cycle arrest (Gautschi et al., 2007). 

Though 'targeted' hypermethylation is more widely recognized 
in cancer development, global hypomethylation 
is almost always linked to malignancy; in normal human 
cells, more than twice as many cytosine residues are 
methylated compared to tumor cell DNA. Nearly half of 
the genome consists of highly repeated DNA sequences 
that account for the global hypomethylation, a 
common trait in cancers, which can result in an increased 
possibility of chromosomal rearrangements (Dunn, 
2003). Activation of proto-oncogenes, including c-myc, 
c-fos, and c-Ha-ras, which upon demethylation become 
abnormally expressed, sets off uncontrolled cellular proliferation. 
Finally, it is possible that inserted pro-viral 
sequences (silenced by methylation) become demethylated, 
which can lead to reactivation and increased infectivity 
(Dunn, 2003; Schar and Fritsch, 2011). 

Tobacco smoke-induced methylation and resulting 
loss of gene expression has been detected in cell lines 
derived from cigarette smoke- or tobacco carcinogen- 
induced mouse lung tumors that showed reduced 
expression of death associated protein (DAP) kinase. The 
loss of expression correlated with the promoter methylation 
state and could be reversed by treatment with 5-aza-2'-deoxycytidine, 
a compound that inhibits DNA methyltransferase. 
The aberrant DAP kinase methylation was 
linked to the earliest pre-neoplastic state of lung adeno- 
carcinoma, and the authors hypothesized that silencing 
of a crucial component of several apoptotic pathways 
could allow the expansion of the malignant cell colony 
leading to lung carcinogenesis (Pulling et al., 2004). 

The methylated-CpG island recovery assay has iden- 
tified genes that are hypermethylated in lung cancer 
cells. The data were used to compare the methylation 
status of A549 lung cancer cells relative to normal human 
bronchial epithelial cells. Judged by the location of the 
differentially methylated CpG islands (close to the ends 
of known or predicted genes or to the exons/introns 
within genes), it was hypothesized that the elements 
have a regulatory function. Notably, one fifth of the top 
50 genes that were identified in the screen belong to the 
homeobox gene family (LHX2, LHX4, PAX7, HOXB13, 
LBX1, SIX2, HOXD3, DLX1, HOXD1, ONECUT2, and 
PAX9) (Rauch et al., 2006). In addition to gene specific 
methylation studies, a genome-wide screen was carried 
out to identify new cancer-specific methylation markers. 
The study included normal and malignant cell lines and 
relied on 5-aza-2'-deoxycytidine treatment as the read- 
out for demethylation and increased gene expression. 
The candidates for cancer cell-specific promoter methyl- 
ation sites were compared to those found in primary epi- 
thelial tumors to establish novel malignancy-associated 
methylation markers (Shames et al., 2006). An extensive 
depiction of methylation events and cancer is provided 
by the MethyCancer database, which documents the 
interplay between DNA methylation, gene expression, 
and cancer (He et al., 2008). 

DNA methylation has also been detected in plasma. A 
panel of six genes (APC, CDH1, MGMT, DCC, RASSF1A, 
and AIM1) was used to compare serum and tumor DNA 
methylation profiles: if a given gene was methylated in 
serum, it was always also methylated in the tumor tissue. 
By contrast, tumor methylation was not always mirrored 
in the serum sample (Begum et al., 2011). A progressive 
rise in plasma DNA methylation has also been detected 
in samples from patients with malignant lung tumors as 
compared to subjects with a normal computed tomography 
(CT) scan (Ostrow et al., 2010). These studies encourage 
the development of non-invasive approaches that could advance 
the clinical management of lung cancer by distinguishing between 
cancerous and noncancerous abnormal CT findings. Such 
methylation markers could be followed early on in heavy 
smokers, and individuals harboring aberrant methylation 
could be subjected to more intensive screening, 
resulting in earlier tumor detection and improved prognosis. 
Moreover, as Dolinoy and Jirtle have proposed, novel 
therapeutic intervention strategies targeting reversible 
unfavorable epigenetic modifications could be developed 
(Dolinoy and Jirtle, 2008). 

It has been shown that there is increased methylation 
for certain individual genes as well as the overall number 
of methylated genes with the progression of pre-neoplas- 
tic lesion towards invasive adenocarcinoma (Licchesi 
et al., 2008; Chung et al., 2011). Selamat et al. have fur- 
ther mapped early, intermediate, and late methylation 
changes occurring throughout the development from 
histologically normal lung tissue (adjacent to the tumor) 
to atypical adenomatous hyperplasia, further to adeno- 
carcinoma in situ and invasive lung adenocarcinoma. 
Unfortunately, the number of samples with smoking 
information was small and no correlation could be drawn 
between smoking status and DNA methylation levels in 
the tissue samples examined. Aberrant hypomethylation 
was evident only in the invasive carcinoma samples, 
indicating that unlike promoter hypermethylation, hypo- 
methylation is a late event during lung adenocarcinoma 
development (Selamat et al., 2011). This is in line with a 
study demonstrating that global DNA hypomethylation 
is highly associated with tumor progression (Anisowicz 
et al., 2008). While global hypomethylation, being a late 
event in the course of disease, may not directly bring 
value to disease prediction, this association is impor- 
tant for understanding the tumor biology and may have 
implications in therapeutic intervention. 

Clearly, epigenetics plays a major role in tumor pro- 
gression; however, it might also be involved in the earlier 
stages of cancer development. According to the previ- 
ously described cancer stem cell theory, the capacity to 
self-renew could be the result of abnormal activation of 
developmental pathways in cells destined for differentia- 
tion (Rivera et al., 2011). It is possible that such pathway 
activation involves epigenetic gene regulation, which 
evidently contributes to the cell's ability to express or 
silence diverse genes during development. 

Both in vitro and genetic studies have documented the 
involvement of the histone code in lung carcinogenesis 
(Peters et al., 2001; Shan et al., 2008; Haberland et al., 
2009; Chi et al., 2010). Histone deacetylase 6 (HDAC6) 
is required for the epithelial-mesenchymal transition 
of TGF-β/SMAD3 signaling-mediated lung cancer cell 
metastasis (Shan et al., 2008). Tumor cell lines engi- 
neered to have conditional HDAC alleles have shown 
that HDAC1 and HDAC2 have redundant functions, and 
while they are expendable in post-mitotic cells, at least 
one of them is required for tumor cell survival (Haberland 
et al., 2009). Mutations and deregulation in histone lysine 
demethylases (HDMs) have been found in many types of 
cancers (Chi et al., 2010). The perturbed patterns of histone 
tail methylation are associated with increased chromosomal 
instability and tumor risk, and this applies to 
both 'writing' and 'erasing' of the methylation code. For 
example, the lack of histone H3 lysine 9 methylation in 
histone methyltransferase (HMTase)-deficient mice leads 
to impaired genomic stability (Peters et al., 2001). 

COPD 

As lung cancer and COPD share many characteristics 
in the early stages of disease development, and since 
each condition increases the susceptibility to the other, 
it is reasonable to assume that they may be affected by 
similar genomic perturbations (Sundar et al., 2011; Yang 
et al., 2011). 

In a recent study, white blood cell DNA isolated from 
subjects with well-defined COPD (using spirometry) was 
used for an array-based DNA methylation screen. The 
association of 349 CpG sites with the severity of COPD 
was reproduced in two family-based cohorts, and sev- 
eral of the 330 genes affected were related to immune 
and inflammatory processes. A significant association 
was found between hypomethylation of SERPINA1 (the gene 
encoding α1-antitrypsin) and low lung function 
(Qiu et al., 2012). 

Cigarette smoke has considerable oxidative stress 
potential, leading to an imbalance between histone 
acetylation and deacetylation. This may account for 
the enhanced expression of inflammatory mediators, 
which in turn leads to amplified pulmonary inflam- 
mation (Rahman et al., 2002; Rahman, 2002). Cigarette 
smokers have increased acetylation of histones H3 and 
H4 (Szulakowski et al., 2006), and in C57BL/6J mice 
cigarette smoke exposure increases the levels of acety- 
lated histones H3 and H4 on lysine 9 after just 3 days 
of exposure (Yang et al., 2008). In rat lung, cigarette 
smoke results in altered histone acetylation, and in 
analogy to COPD, the resulting excessive release of pro- 
inflammatory cytokines was insensitive to glucocorticoid 
treatment (Marwick et al., 2004). Mechanistically, the 
abnormal histone acetylation seems to be due to ciga- 
rette smoke-induced phosphorylation and subsequent 
ubiquitination and proteasomal degradation of HDAC2 
as documented by both in vitro (macrophages, human 
bronchial and primary small airway epithelial cells) and 
in vivo (mouse lung) smoke exposure models (Adenuga 
et al., 2009; Adenuga and Rahman, 2010). COPD patients 
have reduced HDAC activity measured in the peripheral 
lung tissue, alveolar macrophages, and bronchial biopsy 
specimens (Ito et al., 2005). Lung tissue and peripheral 
blood mononuclear cells (PBMC) from COPD patients 
were also found to have reduced levels of Sirtuin 1, a type 
III HDAC that is also classified as an anti-aging molecule 
(Rajendrasozhan et al., 2008; Ito and Barnes, 2009). In 
agreement with the reduced HDAC levels and activity, 
it has been shown that the acetylation of histones H2A, 
H2B, H3, and H4 is increased in the lungs and alveolar 
macrophages of COPD patients (Chen et al., 2008). 

Szulakowski et al. reported a correlation between COPD 
severity and decreased HDAC2 levels in the cytoplasm, 
and between both cytoplasmic and nuclear HDAC2 with 
the reduced lung function (Szulakowski et al., 2006). 
Oxidative stress is one of the major causative factors for 
COPD and most likely leads to the observed degradation 
of HDAC2 with consequent increase in the expression of 
pro-inflammatory cytokines. It has been proposed that the 
inhibition of HDAC2 activity also contributes to the gluco- 
corticoid resistance seen in COPD inflammation (Barnes 
et al., 2005; Barnes, 2006; Barnes, 2009). 

CVD 

Among the many risk factors for cardiovascular disease 
(CVD), smoking is causally linked to both disease onset 
and progression. Vascular remodeling, characteristic of 
CVD, involves chronic inflammation and the release of 
various cytokines and chemokines, which largely determine 
the initiation and progression of an atherosclerotic lesion 
(Wierda et al., 2010). Lung inflammation, the major driver 
of both lung cancer and COPD, has also been linked to 
CVD. It is believed that when lung inflammation, or even 
acute lung injury, turns systemic, it further stimulates 
events that lead to the activation of the vascular endothe- 
lium, heart attack, and stroke (Van Eeden et al., 2012). 

Direct DNA damage has been linked to cardiovascular 
disease; the increase of micronuclei, which is indicative 
of DNA damage, correlates with the severity of athero- 
sclerosis. Genomic instability, in the forms of LOH and 
microsatellite instability (MI), has been documented in 
smooth muscle cells of human plaques. The loci involved 
include the TGF-β1 receptor (MI), mismatch repair genes 
(LOH), and nitric oxide synthase (LOH). Currently, it is 
not clear what the source of DNA damage is in the dis- 
ease, or whether the damage is a cause or a consequence 
of disease progression (Andreassi and Botto, 2003). 

Several independent studies have emerged showing 
that DNA methylation is an important aspect in CVD 
pathology. Analyses of peripheral lymphocytes have 
indicated that DNA from subjects with angiographically 
confirmed coronary artery disease (CAD) had under- 
gone more extensive genomic methylation than that of 
healthy controls. There was also a significant positive 
correlation of global DNA methylation with plasma 
homocysteine levels, which is an independent risk fac- 
tor in CAD patients (Sharma et al., 2008). Kim et al. have 
probed global genomic DNA methylation within ALU 
and Satellite 2 (AS) repetitive elements in peripheral 
blood lymphocytes (PBL) as a population based CVD 
risk assessment. Their study showed a positive correla- 
tion between PBL DNA methylation and prevalence of 
CVD or its risk factors, more pronounced in men than 
women (Kim et al., 2010). Finally, in the apolipoprotein 
E (apoE) mutant mouse, an animal model for CVD, the 
epigenetic changes in PBMC DNA were found within 
coding sequences as well as repeated interspersed 
sequences. Strikingly, the altered methylation pattern 
could be detected before any noticeable atherosclerotic 
lesions were present, suggesting that DNA methylation 
plays a causative role during the course of CVD development 
(Lund et al., 2004). Epigenetic gene regulation has 
been reported to occur in platelets from smokers, in a 
study of monoamine oxidase-B (MAOB) (Launay et al., 
2009). These authors reported that "the methylation frequency 
of the MAOB gene promoter was markedly lower 
in smokers than in non-smokers, due to cigarette smoke-induced 
increase of nucleic acid demethylases activity". 

In addition to DNA methylation, vascular homeostasis 
and atherosclerosis biology depend on the histone code. 
Whilst some functions mediated by HDACs are cyto- 
plasmic (i.e. not associated with chromatin), chromatin 
immunoprecipitation has shown that the FGF2 promoter 
is bound by HDAC5. The promoter is thus presumed to be 
repressed by HDAC5 leading to transcriptional inactivity 
(Zhou et al., 2011). Post-translational histone modifications 
have also been implicated in endothelial nitric oxide synthase 
4 (eNOS4) regulation in endothelial cells, an important 
feature of vascular biology. It has been proposed that his- 
tone modifications are essential in maintaining the eNOS4 
expression and that erasing the histone mark results in the 
hypoxic repression of the eNOS4 gene (Fish et al., 2010). 

DNA methylation signatures 

As DNA methylation plays a critical role in many different 
cellular processes including chronic inflammation, which 
underlies many smoking-related diseases, such 
as COPD, lung cancer, and CVD, whole-genome methylation 
signatures could provide valuable information on 
possible perturbations as a result of cigarette smoking. 
DNA methylation signatures, and the changes thereof, 
complement the conventional gene expression profiling 
in assessing the extent of damage that smoking causes to 
different organs. Li et al. have carried out a whole-genome 
DNA methylation analysis on human PBMCs and dem- 
onstrated 20 genomic features of regulatory, coding, 
non-coding, RNA-coding, and repeat sequences with a 
distinct methylation signature. There was a considerable 
inverse correlation between the allele-specific methyla- 
tion (ASM) and allele-specific gene expression (ASE). As 
PBMCs represent a significant, non-invasive sample 
source, risk assessment could benefit considerably from 
analysis of the PBMC DNA methylome (Li et al., 2010). 
Similar to gene expression analyses, the challenge is to 
categorize the individual methylated genes in a mean- 
ingful way to build a predictive classifier. Assessment of 
network perturbation amplitude by applying systems 
biology data to causal biological networks could identify 
specific signatures beyond single genes for each of the 
many smoking-related diseases (Martin et al., 2012). 

Methylation analysis should be explored, alongside gene
expression, as a method to improve the stratification of
subjects. Gene expression will most likely be inversely
correlated with methylation levels. Furthermore,
this work may pinpoint "interchromosomal networks"
(chromosomal regions) that are subject to joint epigenetic
mechanisms regulating gene expression (Zhao et al., 2006).
A variety of DNA methylation biomarkers have been
developed in the field of tumor classification as well as
disease and therapy prognosis (Sandoval and Esteller, 2012).
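
The expected inverse relationship between methylation and
expression can be checked with a simple correlation. The
following is a minimal sketch only: the per-gene values are
invented for illustration, and the function name is
hypothetical rather than taken from any of the cited studies.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five genes (illustration only, not real data).
promoter_methylation = [0.05, 0.20, 0.45, 0.70, 0.90]  # fraction methylated
expression_level = [9.1, 7.8, 5.2, 3.0, 1.4]           # e.g. log2 expression

r = pearson(promoter_methylation, expression_level)
# A strongly negative r is consistent with an inverse correlation.
```

In a stratification setting, genes whose methylation and
expression disagree with this pattern may deserve closer
inspection before they are used in a classifier.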

Next generation genomics and 
epigenomics platforms 

As reviewed above, a large body of information exists 
on different genomic and epigenetic changes that are 
caused by smoking and other xenobiotic compounds. It
is, however, possible to obtain an even more comprehensive
picture of these perturbations by employing new
and advanced technologies. Pleasance et al. (2010) have
assessed the genome-wide mutational load extracted from 
an SCLC cell line by a massively parallel sequencing tech- 
nology. The authors showed a comprehensive, somatically 
acquired, mutational profile including base substitutions, 
insertions, and deletions, as well as copy number changes 
and genomic rearrangements. The 23,000 SCLC muta- 
tions could be classified into distinct mutation signatures. 
Overall, these remarkable mutational patterns observed 
in the SCLC genome highlight the utility of genome-wide 
analyses to obtain true DNA signatures associated with 
lung cancer and carcinogen exposure. The patterns would 
not have been identified by simply sequencing limited 
genomic regions (Pleasance et al., 2010).
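
As an illustration of how base substitutions can be grouped
into classes before mutation signatures are compared, the
sketch below folds each substitution onto the pyrimidine
strand, a widely used convention that yields six classes
(C>A, C>G, C>T, T>A, T>C, T>G). The function names and the
example calls are hypothetical and are not taken from
Pleasance et al.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def substitution_class(ref, alt):
    """Report a substitution relative to the pyrimidine base (G>T becomes C>A)."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):  # purine reference: use the complementary strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

def substitution_spectrum(substitutions):
    """Count substitutions per class from (ref, alt) pairs."""
    return Counter(substitution_class(r, a) for r, a in substitutions)

# Hypothetical somatic calls (illustration only).
calls = [("G", "T"), ("C", "A"), ("G", "A"), ("A", "G"), ("T", "C"), ("C", "T")]
spectrum = substitution_spectrum(calls)
```

Counting on one strand only avoids reporting the same
mutational event as two different classes depending on
which strand was sequenced.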

Several parallel sequencing technologies, including
pyrosequencing, fluorescence-based sequencing-by-synthesis
and sequencing-by-ligation, ion semiconductor sequencing,
and single molecule real time (SMRT) sequencing, have
been developed in the last decade. Their read lengths
and throughputs range from about 75,000 single-end
long reads (~1 kb) to 6 billion 2 x 100 bp paired-end reads
(Supplementary Material; Niedringhaus et al., 2011).
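
To put the two ends of this range on a common scale, the
figures can be converted into total sequenced bases and an
approximate mean coverage. This is a rough sketch under two
assumptions that are not stated in the text: that "6 billion"
counts individual 100 bp reads, and that the target is a
human genome of roughly 3.1 Gb.

```python
HUMAN_GENOME_BP = 3_100_000_000  # assumption: approximate human genome size

def total_bases(n_reads, read_length_bp):
    """Total number of sequenced bases in a run."""
    return n_reads * read_length_bp

def mean_coverage(n_reads, read_length_bp, genome_bp=HUMAN_GENOME_BP):
    """Average number of times each genome position is sequenced."""
    return total_bases(n_reads, read_length_bp) / genome_bp

long_read_run = total_bases(75_000, 1_000)         # 75 Mb
short_read_run = total_bases(6_000_000_000, 100)   # 600 Gb

long_read_coverage = mean_coverage(75_000, 1_000)
short_read_coverage = mean_coverage(6_000_000_000, 100)
```

Under these assumptions the two platforms differ by nearly
four orders of magnitude in output per run, which is why the
choice of platform depends strongly on the application.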

High-throughput sequencing 
application in genomics 

Regardless of the technology used, high-throughput
sequencing has applications in both genomics and
epigenomics. Apart from its use in de novo sequencing and
re-sequencing of whole genomes, deep sequencing of a genome
(DNA-seq) enables a wide range of analyses. Structural variations,
copy number variations, single nucleotide polymor- 
phisms (SNPs) and small insertions or deletions can be 
identified based on the comparison of a re-sequenced 
genome to a reference genome. If genomic regions of 
interest are known, targeted re-sequencing can be used 
to reduce costs, complexity and time. 
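
The principle of identifying variants by comparison with a
reference can be sketched in a few lines. The example below
is deliberately simplified: it assumes two sequences that
are already aligned and of equal length, reports only
single-base differences, and uses hypothetical names and
toy sequences. Real variant callers work from many
overlapping reads and model sequencing error.

```python
def find_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Both sequences must be pre-aligned and of equal length. Positions
    where either sequence has an ambiguous base (N) or a gap (-) are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    skipped = {"N", "-"}
    differences = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        if ref_base in skipped or sample_base in skipped:
            continue
        if ref_base != sample_base:
            differences.append((position, ref_base, sample_base))
    return differences

# Toy sequences (illustration only).
reference_seq = "ACGTTGCA"
sample_seq = "ACGATGCA"
variants = find_substitutions(reference_seq, sample_seq)
```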

High-throughput sequencing application in 
epigenomics 

Whole genome DNA methylation can be investigated
either at a single nucleotide level using shotgun bisulfite
sequencing (MethylC-seq) and reduced representation
bisulfite sequencing (RRBS), or at the level of a few tens
of nucleotides with methylated DNA immunoprecipitation
(MeDIP-seq), methylated DNA capture by affinity
purification (MethylCap-seq), methylated DNA binding
domain sequencing (MBD-seq) or methylation-sensitive
restriction enzyme sequencing (MRE-seq) (Bock et al.,
2010; Harris et al., 2010). MethylC-seq and RRBS require
the sequencing of both the untreated and the
bisulfite-treated genomes in order to compare them and
identify methylated cytosines. With MeDIP-seq, MethylCap-seq
and MBD-seq, methylated genomic DNA fragments are
enriched before sequencing. In contrast, MRE-seq
enriches unmethylated genomic DNA fragments. These
enrichment-based methods reduce the amount of sequencing
necessary, but do not provide single nucleotide accuracy.
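
The comparison that underlies bisulfite-based methods can be
illustrated as follows. Bisulfite treatment converts
unmethylated cytosines so that they are read as T, while
methylated cytosines are still read as C. The sketch assumes
that a known reference sequence stands in for the untreated
genome, that reads are ungapped and aligned to the forward
strand, and that conversion is complete; the function name
and toy data are hypothetical.

```python
def call_methylation(reference, bisulfite_reads):
    """Count methylated and total observations at each reference cytosine.

    bisulfite_reads is a list of (0-based start, read sequence) pairs.
    Returns {position: (methylated_count, total_count)}.
    """
    reference = reference.upper()
    counts = {}
    for start, read in bisulfite_reads:
        for offset, base in enumerate(read.upper()):
            position = start + offset
            if position >= len(reference) or reference[position] != "C":
                continue
            if base not in ("C", "T"):
                continue  # neither state: likely an error or a variant
            methylated, total = counts.get(position, (0, 0))
            # C retained means methylated; C read as T means unmethylated.
            counts[position] = (methylated + (base == "C"), total + 1)
    return counts

# Toy data (illustration only): cytosines at positions 1, 4 and 7.
reference_seq = "ACGTCGAC"
reads = [(0, "ACGTTGAT"), (0, "ACGTCGAT"), (1, "CGTTGA")]
site_counts = call_methylation(reference_seq, reads)
methylation_levels = {pos: m / t for pos, (m, t) in site_counts.items()}
```

Because each cytosine is scored individually, this kind of
analysis gives the single nucleotide resolution that the
enrichment-based methods lack, at the cost of more
sequencing.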

Histone modifications can be studied using chromatin 
immunoprecipitation (ChIP) with antibodies specific for 
a given histone modification. After precipitation, the DNA 
bound to the precipitated chromatin can be hybridized on 
a microarray chip (ChIP-chip) or sequenced (ChIP-seq) on
a high-throughput platform. ChIP-chip will require a large
number of arrays to cover whole mammalian genomes,
while ChIP-seq provides the whole genome coverage even
at a low sequencing depth (Barski et al., 2007). 

With the development of platforms based on high- 
throughput sequencing, genomics and epigenomics 
analyses are freed from the requirement for prior knowl- 
edge inherent to array-based platforms. The platforms 
thus move closer to measuring what is in the sample 
rather than what is on the chip. Furthermore, as high- 
throughput sequencing directly measures the DNA 
present in a sample, it can capture several modifications 
and alterations at once. The data generated by these
platforms are therefore more flexible in the way they can
be analysed and provide a wider spectrum of genomic
information. Computational resources and methodologies
for the analysis of large-scale epigenomics datasets
have been developed in recent years; however, as noted
by Huss, "custom tools are needed to optimally analyze
ChIP-seq data on histone modification and BS-seq
data on DNA methylation" (Huss, 2010). Furthermore,
reaching the full potential of whole genome epigenomics
studies is still hindered by the lack of guidelines and
repositories for data submission. Establishing these would
speed up research and allow for advanced applications such
as therapeutic interventions (Ongenaert, 2010).

These next generation technology platforms should be
the preferred methodologies for analyses of DNA and
chromatin. They have proven to be powerful tools,
and are clearly the future means of identifying genetic
and epigenetic aberrations triggered by compounds in
tobacco smoke. In their recent review, Pfeifer and Hainaut
have put this assertion into perspective: "the analysis of
a single cancer genome may generate almost twice as 
much mutation data as the whole literature on sequenc- 
ing the p53 gene accumulated during 20 years" (Pfeifer 
and Hainaut, 2011). 

Discussion 

Better understanding of the mechanisms of the major
smoking-related diseases was the main aim of a recent
report (U.S. Department of Health and Human Services,
2010); however, the report said little about how this
understanding might actually be gained. Product safety
assessment through clinical and epidemiological stud- 
ies, generally decades after the initial exposure, rarely 
explains the mechanisms that link biological perturba- 
tions to a given exposure. In contrast, predictive systems 
biology approaches offer a valuable means for the assess- 
ment of biological perturbations before the phenotypic 
outcomes manifest as well as for pinning down the 
mechanisms involved in the body's response to poten- 
tially harmful substances. Global measurements from 
in vitro cellular and in vivo animal experimental models 
can be causally linked to a biological network and this 
offers immense power to identify important pathways 
and targets for intervention (Hoeng et al., 2012). 

A major component of systems biology is to identify 
the gene-related network perturbations caused by xeno- 
biotics. For example, a constitutively increased growth 
pattern in gene expression profiling may stem from an 
activating mutation in a cell growth signaling gene such 
as KRAS or EGFR. While both effects may provide similar 
pictures at gene expression level, their impact on disease 
progression is clearly different. The integration of
epigenetics and genomics into the systems biology approach
will lead to the identification of a full panel of
biomarkers and to a more reliable depiction of the DNA
damage caused by cigarette smoke. As some of the epigenetic
and genomic changes, e.g. DNA methylation, can be
reversible, their targeting has been considered for therapies.
In this context, it is also of interest to study these
effects upon smoking cessation, to determine how many of
the changes presented above reverse completely upon
cessation and how long the reversal process takes.

Given the very rapid increase in knowledge on the
genomic effects of cigarette smoke, our recommendation
is that this research be continued to definitively
determine the pathways (i.e. mechanisms) by which smoke
actually causes diseases such as those described here. This
genomic investigation will spur the formulation of new
testable hypotheses and provide mechanistic insight into
how exposure to chemicals and mixtures is related to
disease onset and progression, thus providing the basis
for biomarker selection and justification. It might then
be possible to identify specific chemicals (or classes of
chemicals) in the smoke that are linked to specific
diseases, a long-term goal (Wynder, 1980) which has so
far met with only minimal success. Research could then
be pursued on reducing or eliminating those chemicals
in smoke that have been identified by systems biology
as necessary to substantially perturb disease networks,
as opposed to relying on the current assumption that
'harmful and potentially harmful constituents' (U.S.
Department of Health and Human Services, 2011) of the
smoke matrix can be reliably identified by traditional
toxicological or epidemiological studies.

Our approach may be more robust than the simplistic
application of genetic toxicology testing of either whole
smoke or of cigarette smoke condensate. Mutagenicity
testing of highly complex mixtures seems scientifically
irrational, but it is an approach that is often used when
comparing cigarette types (Demarini et al., 2008) and
smokeless products (Johnson et al., 2009). It might also
be possible to take the now-established mechanisms for
the different diseases in humans, and see whether these
mechanisms also exist (or could be improved) in animal
species that have been used as surrogates for humans
(Schleef, 2006), with the aim of building better animal
models (Coggins, 2010). Transgenic strains of laboratory
animals are becoming increasingly popular and
seemingly limitless in scope.

It is very likely that multiple redundant pathways 
exist for each of the disease states we have reviewed. The 
adoption of a systems biology approach relying on results 
from high-throughput sequencing genomics studies 
may therefore be a significant contribution enabling the 
elucidation of these pathways and of their interplay. We 
propose that the use of systems biology can go beyond 
just toxicological assessment, and can be applied in other 
areas such as drug development, pharmacology, and 
personalized medicine. Genomic and epigenomic measurements
should be considered in order to gain a better
understanding of the mechanisms of action of cigarette
smoke on disease development, as also indicated by Hammons
and Lyn-Cook in their recent review (Hammons and Lyn-Cook, 2011).
Furthermore, the genomic and epigenetic mechanisms 
may serve as causal links between exposure, dose and 
duration. Recently, Hou et al. summarized how expo- 
sure to environmental chemicals can cause epigenetic 
changes in a dose-dependent manner (Hou et al., 2012). 
Moreover, the initiation of biological perturbations result- 
ing in malignancy by certain chemical compounds can be 
explained by epigenetic mechanisms (Stein, 2012). 

Conclusion 

The limitations of the traditional risk assessments include 
the lack of mechanistic information, uncertainty about 
the toxicity pathways that might potentially be affected, 
relevance of the endpoints of toxicity, default
assumptions for dose-response extrapolations, and uncertainty
about intra- and inter-species "safety factors" (Barlow
et al., 2006; Edwards and Preston, 2008; Hartung, 2009). 
In this review, we have focused on the genomic and epig- 
enomic perturbations caused by tobacco smoke and how 
the latest technological developments enable the research 
community to investigate them. In general, toxicogenom- 
ics is not yet fully accepted in the toxicological sciences, 
due to regulatory concerns and the lack of published
proof-of-concept studies (Mendrick, 2008). All these aspects
are currently being addressed, and advances in molecular
systems biology are unveiling the multifaceted nature of
disease processes. The toxicology community should
incorporate this emerging knowledge through an iterative
approach of examining proofs of concept, assessing the
value of new information, and developing decision rules.
Programs such as NexGen
(http://www.epa.gov/risk/nexgen/index.htm) from the US
Environmental Protection Agency guide the community in
this direction.

For a complete understanding of the effects of these 
perturbations, all components of network biology should 
be included (Figure 1). This allows the mechanistic 
understanding of the biological responses that attempt to 
protect the organism from potentially harmful substances 
and the identification of predictive biomarkers for disease 
onset. Ultimately, the combined quantifiable genomic 
and epigenomic perturbations will allow us to measure 
the genomic impact of a substance on a biological system. 